APPARATUS AND METHOD OF ADDITIVE SYNTHESIS OF DIGITAL
AUDIO SIGNALS USING A RECURSIVE DIGITAL OSCILLATOR

BRIEF DESCRIPTION OF THE INVENTION 

This invention relates generally to the processing of digital audio signals. 
More particularly, this invention relates to a technique for additive synthesis of digital 
audio signals using a recursive digital oscillator. 



BACKGROUND OF THE INVENTION

Additive synthesis is a signal synthesis technique based on the Fourier
theorem. This theorem states that any signal can be decomposed into a set of
constituent sine waves, and that the sum of the constituents will reconstitute the
original. Additive synthesis is classified as a receiver-based synthesis algorithm, but
differs from other receiver-based schemes, such as subtractive synthesis and sampling,
in that it is represented in the spectral (frequency) domain rather than the time domain.

There are many benefits in the use of additive synthesis for sound production
in computer music applications. These include expressive musical control over fine
timbral distinctions, perceptually relevant parameterizations, sample rate independence
of timbre description, availability of many analysis techniques, high control
bandwidth, and multiple dimensions for resource allocation/optimization.

The challenge of the additive synthesis technique is the computational intensity
of the separately controllable sinusoidal partials. A single low frequency piano note
can require hundreds of time-varying sinusoids for accurate reproduction. Musically
effective use of additive synthesis in live performance can require the ability to control
many hundreds or even thousands of sinusoidal partials in real time.

This computational challenge is addressed by resolving two issues: which
hardware architecture to use and which sinusoid generation algorithm to use on the
selected architecture. Digital Signal Processors or vector processors are a good
selection for the data type and associated computational demands. Unfortunately, such
architectures do not always support full-range (i.e., floating-point) arithmetic; fixed
point may be all that is provided. There is always a large demand for low-cost
implementations. Therefore, it is desirable to be able to exploit a relatively
inexpensive, moderate-precision arithmetic hardware architecture, such as a 16-bit
processor.

A number of sinusoidal partial production techniques may be used on a
selected hardware architecture. These techniques can be placed in three classes: those
that implement recursive filters, those using table-lookup, and those that work in the
transform domain using techniques such as the inverse fast Fourier transform. The
transform-domain approach is most advantageous for applications requiring many
sinusoids and for which some error in phase and amplitude and some latency is
acceptable. The lookup technique is the most widely used for applications requiring a
few sinusoids at a very high data rate, such as radio frequency communications.

Recursive oscillators have several advantages, including the inherent fine-grain
exposure of data parallelism, the far more limited demand on the memory system
compared to table look-ups, the lower induced latency than with a transform-domain
approach, the latency flexibility, and/or the attainable phase accuracy.



The primary problem with digital recursive oscillators is managing long-term
stability as rounding and truncation errors accumulate. Another problem with
recursive oscillators is providing sufficient frequency coefficient resolution.

In view of the foregoing, it would be highly desirable to provide an improved
technique for processing real-time partials on a general purpose hardware architecture.
Ideally, the technique could be readily implemented on a moderate-precision
arithmetic hardware architecture, such as a 16-bit processor. The technique should
address the problem of error accumulation inherent in recursive methods. In addition,
the technique should provide sufficient frequency coefficient resolution.

SUMMARY OF THE INVENTION

The method of the invention is directed toward performing additive synthesis
of digital audio signals with a recursive digital oscillator. The method includes the
step of receiving digital audio signal frames wherein each digital audio signal frame
includes a set of frequency, amplitude, and phase components represented as
coefficients of variables in a mathematical expression. Each digital audio signal frame
thereby includes a frequency coefficient representation. Converted frequency
coefficients are formed by linearly re-mapping bits of the frequency coefficient
representation to bias audio reproduction accuracy toward low frequency signals.
Additive synthesis is then performed with the converted frequency coefficients.

The method of the invention also includes receiving digital audio signal frames
wherein each digital audio signal frame includes a set of frequency, amplitude, and
phase components, represented as coefficients in the standard mathematical expression
of the Fourier theorem; the step of converting frequency components of each digital
audio signal frame to bias reproduction accuracy toward lower frequencies in the audio
spectrum through the use of a re-mapping of the bits of the component and through the
addition of a range-extending shift amount; and the step of performing additive
synthesis via the use of an efficient recursive digital oscillator structure that uses the
converted frequency coefficients internally.

The apparatus of the invention includes a computer readable memory to direct
a processor to function in a specified manner. The computer readable memory includes
a first set of executable instructions to receive digital audio signal frames wherein each
digital audio signal frame has a set of specified frequency values expressed as a bit
sequence. A second set of executable instructions transforms the bit sequence to
represent lower frequencies with more significant bits and higher frequencies with less
significant bits. A third set of executable instructions facilitates additive synthesis of
the digital audio signal frames in a reduced-precision recursive digital oscillator.
Sound is produced as multiple recursive oscillators operate in parallel.

The invention provides an improved technique for real-time production of
summed variable-frequency sinusoids on a general purpose hardware architecture.
The technique is readily implemented on a moderate-precision arithmetic hardware
architecture, such as a 16-bit processor, but is also successfully implemented on a
variety of hardware architectures. The technique of the invention addresses the
problem of error accumulation inherent in recursive oscillation, the problem of
providing adequate frequency coefficient resolution inside individual oscillators, and
the problem of providing computationally efficient additive synthesis on a variety of
hardware platforms.



BRIEF DESCRIPTION OF THE DRAWINGS

For a better understanding of the invention, reference should be made to the
following detailed description taken in conjunction with the accompanying drawings,
in which:

FIGURE 1 illustrates an apparatus for implementing an embodiment of the
invention.

FIGURE 2 illustrates an embodiment of the invention in the context of an
analysis/re-synthesis framework, and identifies the processes in the framework that
require real-time performance.

FIGURE 3 illustrates overlapping audio frames processed in accordance with
an embodiment of the invention.

FIGURE 4 illustrates the partitioning of a theta term into alpha and beta
components in accordance with an embodiment of the invention.

FIGURE 5 illustrates a comparison of original absolute error due to coefficient
quantization error, and the modified error achieved in accordance with an embodiment
of the invention.

FIGURE 6 is a detailed illustration of the modified absolute error due to
coefficient quantization error achieved in accordance with an embodiment of the
invention.

Like reference numerals refer to corresponding parts throughout the drawings.

DETAILED DESCRIPTION OF THE INVENTION

Figure 1 illustrates an apparatus 20 that may be used to implement an
embodiment of the invention. The apparatus 20 includes the components associated
with a general purpose computer. In particular, the apparatus 20 includes a processor
22, many variations of which are discussed below. The processor 22 is connected to a
set of input/output devices 24 via a bus 26. The input/output devices 24 may include
such components as a keyboard, mouse, speakers, video monitor, and the like.

A memory (primary and/or secondary) 28 is connected to the bus 26. The
memory 28 stores a set of executable instructions used to implement the processing of
the invention. In particular, the memory 28 stores a non-real-time processing module
30 to perform prior art processing of the type described below. In accordance with the
invention, the memory 28 also stores a frequency coefficient conversion module 32.
As discussed below, the frequency coefficient conversion module 32 re-maps bits of a
frequency coefficient representation to bias audio reproduction accuracy at the
input/output devices 24 toward low frequency signals. An additive synthesizer 34
built using a new formulation of a prior art recursive oscillation technique is then used
to process the linearly re-mapped bits of the frequency coefficient representation. For
the purpose of convenience, the invention is frequently described in the context of a
single recursive oscillator. This reference to a single recursive oscillator contemplates
the use of multiple recursive oscillators operating in parallel to produce sound, as
understood from the following discussion.

The invention is directed toward the frequency coefficient conversion module
32 and the additive synthesizer 34, which efficiently creates sound based on the output
of the conversion module 32. The context in which this module operates and the
operations that it performs are more fully appreciated with reference to Figure 2.
Figure 2 illustrates an example of a complete additive analysis/synthesis system
framework. The steps to the right of the thick dashed line 50 are computed in real
time by the frequency coefficient conversion module 32 and the additive synthesizer
34. The steps to the left of the line 50 are performed by the non-real-time processing
module 30.

The concept behind this particular separation is that a set of sound primitives,
expressed as sets of overlap-add frames, called timbral prototypes, can be generated
off-line via the non-real-time steps as part of the compositional process. Then at
performance time, sets of timbral prototypes are loaded into and out of memory 28
according to a score, where they can be manipulated and combined in response to
controller input from a performer operating the input/output devices 24. The modified
frames are then synthesized in real time for subsequent audition. The use of such a
paradigm enables additional degrees of freedom in performance beyond those available
through, for example, conventional sample-playback-based synthesis.

The processing associated with the present invention is directed toward the
final step, that of taking a set of dynamically changing frames and synthesizing them
into audio samples. The challenge of using a vector instruction set architecture is
explicitly managing parallelism due to the independence of sinusoid computations.
The technique of the invention exploits the natural coarse-grained parallelism by
choosing to stripe state variables of sinusoids across the length of the vectors. The
technique of the invention allows for implementation on a moderate-precision
arithmetic unit (e.g., a 16-bit processor) using moderate-precision numeric
representations. In particular, the invention provides sufficient frequency coefficient
resolution by modifying a standard recursive form. The technique also reduces
quantization-induced noise effects by keeping oscillators short-lived in order to exploit
short-term fidelity.

The input to the frequency coefficient conversion module 32 is a series of
variable-length overlap-add frames. A succession of such frames constitutes a timbral
prototype, which is either synthetically designed or derived through a separate analysis
phase, as depicted in Figure 2. The analysis phase may include the generation of a
sound (block 60) from which spectral estimation is used to produce a set of fast
Fourier transforms (block 62). Pitch is then detected to produce a set of pitch
estimates (block 64). The pitch estimates are then used to identify new window
lengths associated with the spectral estimation. This results in a new set of fast
Fourier transforms (block 66). Peak detection is performed for the new fast Fourier
transforms to produce new peak estimates (block 68). Smoothing is then performed
on the peaks and an overlap-add frame operation (block 70) is then initiated. The
frequency coefficient conversion module 32 then performs coefficient re-mapping
and frame stitching (overlap-add frames) (block 72), as discussed below. Sound 80
corresponding to the initial sound (block 60) may then be produced with an additive
synthesizer 34 based on the recursive oscillator structure described below.

Each frame consists of a frame header and frame data. The frame header is a
double-precision floating point time stamp denoting the start time of the frame and an
integer denoting the number of partials in it. The frame data is a list containing the
fixed frequency, peak amplitude, and initial phase for each sinusoid in the frame, all in
single-precision floating point.

At any instant of time, a timbral prototype is being synthesized as a weighted
sum of two constituent frames. Each of the two sets of frame data is synthesized at a
constant frequency and phase. Irrespective of their timestamps, successive frames are
50% overlapped with individual amplitude envelopes linearly increasing from zero to
the specified peak amplitude value for the first overlapped portion of the frame, and
linearly decreasing from this peak back to zero during the second portion of the frame.
This is illustrated in Figure 3. The two sets of scaled, overlapped frame partials are
summed to constitute an output channel.

An important feature of this approach is that for individual generating
oscillators, the frequency, phase, and amplitude remain constant. By overlapping and
adding successive oscillators with the triangular amplitude envelope, two fixed-
frequency, fixed-amplitude sinusoids closely approximate a single varying-frequency,
varying-amplitude partial.
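
The overlap-add scheme described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the implementation of the invention: the function names (`triangular_envelope`, `overlap_add`), the `(frequency, amplitude, phase)` tuple layout, the restriction to one sinusoid per frame, and the use of `math.sin` in place of a recursive oscillator are all assumptions made for clarity.

```python
import math


def triangular_envelope(n, half_len):
    """Weight that rises linearly from 0 to 1 over half_len samples, then falls back to 0."""
    if n < half_len:
        return n / half_len
    return (2 * half_len - n) / half_len


def overlap_add(frames, sample_rate, half_len):
    """Sum 50%-overlapped frames, each holding one fixed-frequency, fixed-phase sinusoid.

    frames: list of (freq_hz, peak_amplitude, initial_phase) tuples (hypothetical layout).
    Frame k starts at sample k * half_len and lasts 2 * half_len samples.
    """
    out = [0.0] * (half_len * (len(frames) + 1))
    for k, (freq_hz, amp, phase) in enumerate(frames):
        omega = 2.0 * math.pi * freq_hz / sample_rate
        start = k * half_len
        for n in range(2 * half_len):
            weight = amp * triangular_envelope(n, half_len)
            out[start + n] += weight * math.sin(phase + omega * n)
    return out
```

When two successive frames share the same frequency and amplitude and the second frame's initial phase continues the first, the falling and rising envelopes sum to one, so the overlapped region reproduces the plain sinusoid. When the frames differ, the same crossfade interpolates between them.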

As previously indicated, there are many ways to generate sinusoids. The most
common methods include various recursive techniques, table look-up, and transform
domain methods, such as those using the inverse fast Fourier transform. The present
invention relies upon recursive techniques due to their heavy reliance on explicitly
parallel arithmetic with fewer time-consuming memory accesses. In accordance with
an embodiment of the invention, the following digital resonator, with no damping or
initialization impulse function, is used:

$$x_n = 2\cos\!\left(\frac{2\pi f}{f_s}\right) x_{n-1} - x_{n-2}$$

with $f_s$ as the sampling frequency, and $f \in (0, f_s/2)$ as the desired (constant)
frequency of oscillation.
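
As a minimal floating-point sketch of this resonator (the fixed-point form discussed later is not modeled here), the recurrence can be run directly once the two previous samples are known. The function name `resonator` and the choice to seed the state with exact sine values are assumptions of this example.

```python
import math


def resonator(freq_hz, sample_rate, phase, num_samples):
    """Generate sin(phase + n * omega) with the undamped two-term recurrence."""
    omega = 2.0 * math.pi * freq_hz / sample_rate
    coeff = 2.0 * math.cos(omega)
    x1 = math.sin(phase - omega)        # x[-1], seeded directly (assumption)
    x2 = math.sin(phase - 2.0 * omega)  # x[-2]
    out = []
    for _ in range(num_samples):
        x0 = coeff * x1 - x2            # x[n] = 2 cos(omega) x[n-1] - x[n-2]
        out.append(x0)
        x2, x1 = x1, x0
    return out
```

Each output sample costs one multiply and one subtract, with no table access, which is the property the text relies on.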

To implement this equation using only sixteen-bit fixed-point multiplies, it is
necessary to (1) manage the fixed-point units with enough precision to maintain
accuracy across the entire audible frequency range, while (2) taking special care to
provide sufficient frequency coefficient resolution to account for human ability to
distinguish subtle differences in low frequencies. Accuracy must be maintained across
a broader range and with more precision for low-frequency partials than a simple
sixteen-bit fixed-point representation supplies. Additionally, because the frequency
coefficient multiplication is in the critical path, it is desirable to minimize the
computational overhead of the changes.

To quantify the issue, the minimum perceptible musical interval is specified.
Afterwards, the resolution necessary to maintain relative frequency accuracy is
calculated. Doing so indicates that the low-frequency components require more
precision than higher ones, which is intuitive, since relative accuracy is being
calculated. Thus, to minimize perceived error, the frequency coefficient
representations are re-mapped in two ways: by employing an exponent internally to
emulate floating-point range extension, and by inverting the bit representation to bias
accuracy toward low frequencies. These changes require two new operations per filter
per sample: an add with constant shift and a variable shift.

To understand the modifications to the filter, recall the original recurrence
relation for the sine wave generator (with $\omega = 2\pi f / f_s$). At low frequency, the
coefficient $2\cos(\omega)$ is very close to two, and so in a floating-point format, lower
frequencies synthesized using the formula will have less accuracy than higher
frequencies due to the need to explicitly represent the leading ones in the mantissa.
Numbers closer to zero benefit from the implicit encoding of leading zeros via a
smaller exponent. In other words, larger values require bits with larger "significance"
(absolute value), forcing the least significant bits in the same word to also have higher
significance, thus forcing higher worst-case quantization error. One can more
effectively use the bits of the mantissa by reversing this relationship, recasting the
equation as:

$$x_n = 2\cos(\omega)\,x_{n-1} - x_{n-2}$$
$$x_n = 2(1 - \epsilon/2)\,x_{n-1} - x_{n-2}$$
$$x_n = 2x_{n-1} - \epsilon\,x_{n-1} - x_{n-2}$$

i.e., where $\cos(\omega) = 1 - \epsilon/2$.
To represent $\epsilon$, an unsigned sixteen-bit mantissa $m$ is combined with an
unsigned exponent $e$ biased so that the actual represented value is $\epsilon = 2^{2-e} m$. Thus, the
exponent is also the right shift amount necessary to correct a 16b x 16b -> 32b
multiply with $\epsilon$ as an operand. The two in the exponent allows $\epsilon$ to range from 0 to 4
when $m$ is interpreted as a fractional amount and $f$ ranges between zero and the
Nyquist frequency.
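
The effect of this representation can be illustrated with a short sketch that encodes a frequency coefficient as a mantissa and exponent and compares the resulting frequency with the one obtained by rounding $2\cos(\omega)$ itself to a 16-bit word. Everything below is an assumption made for illustration: the helper names, the normalization rule (shift until the top mantissa bit is set), the exponent cap `MAX_EXPONENT`, the signed Q15 format of the state variable, and the $2^{-14}$ step used for the direct coefficient.

```python
import math

MANTISSA_BITS = 16   # unsigned mantissa width, as stated in the text
MAX_EXPONENT = 31    # hypothetical cap on the shift amount


def encode_epsilon(eps):
    """Return (m, e) such that eps is approximately 2**(2 - e) * m / 2**16."""
    if not 0.0 < eps < 4.0:
        raise ValueError("epsilon must lie in (0, 4)")
    e = 0
    # Increase the exponent until the fractional mantissa is at least one half.
    while eps * 2.0 ** (e - 2) < 0.5 and e < MAX_EXPONENT:
        e += 1
    m = round(eps * 2.0 ** (e - 2) * (1 << MANTISSA_BITS))
    return min(m, (1 << MANTISSA_BITS) - 1), e


def decode_epsilon(m, e):
    """Value represented by mantissa m (read as a fraction) and exponent e."""
    return (m / (1 << MANTISSA_BITS)) * 2.0 ** (2 - e)


def epsilon_times_x(m, e, x_q15):
    """eps * x for x in signed Q15: integer product, then a right shift of 14 + e."""
    return (m * x_q15) >> (MANTISSA_BITS - 2 + e)


def frequency_from_epsilon(eps, sample_rate):
    """Invert cos(omega) = 1 - eps / 2 to recover the frequency in Hz."""
    return sample_rate * math.acos(1.0 - eps / 2.0) / (2.0 * math.pi)


def direct_quantized_frequency(freq_hz, sample_rate):
    """Frequency realized when 2*cos(omega) is rounded to a word with step 2**-14."""
    step = 2.0 ** -14
    c = round(2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate) / step) * step
    return sample_rate * math.acos(max(-1.0, min(1.0, c / 2.0))) / (2.0 * math.pi)
```

Under these assumptions, a 100 Hz partial at a 44.1 kHz sampling rate lands several hertz off when the coefficient $2\cos(\omega)$ is rounded directly, while the mantissa-and-exponent form stays within a small fraction of a hertz.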

What is achieved with this re-mapping of number representation is the ability
to represent lower frequencies with more significant bits by mapping higher
frequencies with less significant digits. In particular, as $2\cos(2\pi f / f_s)$ varies
from -2 to 2, $\epsilon$ is defined to vary from 4 to 0. Smaller frequency values produce
smaller $\epsilon$ values, helping to satisfy asymmetric accuracy requirements of the human
auditory system.

Initialization can be quickly accomplished in accordance with the invention. In
particular, the resonator can be initialized to a desired frequency and phase at sample
$x_0$ by properly choosing the two state variables $x_{-2}$ and $x_{-1}$ using function evaluations in
place of an initialization forcing function. The lookup values for a sinusoid with phase
$p$ and frequency $f$ are:

$$x_{-1} = \sin\!\left(p - \frac{2\pi f}{f_s}\right); \quad x_{-2} = \sin\!\left(p - \frac{4\pi f}{f_s}\right)$$

These initializations must be accurate down to the low-order bits in a 32-bit fixed
point representation, with the binary point set between the third and fourth bit
positions in order to support a phase in the range $[0, 2\pi]$. In addition, it is necessary to
compute the frequency coefficient $2 - 2\cos(\omega)$ to 32-bit accuracy.

These initial evaluations can be computed more quickly by rewriting the
equations for $x_{-1}$ and $x_{-2}$ in a form that requires only the computation of $\sin(p)$, $\cos(p)$,
$\sin(\omega)$, and $\cos(\omega)$:

$$x_{-1} = \sin(p - \omega) = \sin(p)\cos(\omega) - \cos(p)\sin(\omega)$$

$$x_{-2} = \sin(p - 2\omega) = \sin(p)\cos(2\omega) - \cos(p)\sin(2\omega) = 2\cos(\omega)\sin(p - \omega) - \sin(p) = 2\cos(\omega)\,x_{-1} - \sin(p)$$
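
The rewritten initialization can be checked numerically with a small floating-point sketch. The function name `init_state` and the use of the standard library's `math.sin` and `math.cos` are assumptions of this example; the text calls for 32-bit fixed-point evaluation, which is not modeled here.

```python
import math


def init_state(phase, omega):
    """Return (x[-1], x[-2], epsilon) from only sin/cos of the phase and of omega."""
    sin_p, cos_p = math.sin(phase), math.cos(phase)
    sin_w, cos_w = math.sin(omega), math.cos(omega)
    x_m1 = sin_p * cos_w - cos_p * sin_w   # sin(p - omega)
    x_m2 = 2.0 * cos_w * x_m1 - sin_p      # sin(p - 2 * omega)
    eps = 2.0 - 2.0 * cos_w                # frequency coefficient
    return x_m1, x_m2, eps
```

Feeding these values into one step of the recurrence, `2 * x_m1 - eps * x_m1 - x_m2`, yields `sin(phase)`, so the first generated sample carries the requested phase.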



It may seem that this has actually increased the amount of work to be
performed because there are now four trigonometric evaluations rather than three (two
initialization sines plus the cosine in the recursive form). However, this approach
turns out to be more efficient by allowing for the judicious sharing of intermediate
values in a tandem sine and cosine generation procedure. The tandem subroutine
returns both $\sin(\theta)$ and $\cos(\theta)$ for $\theta \in [0, 2\pi]$ to full 32-bit fixed-point precision using a
hybrid technique combining table-lookup and Taylor expansion. This keeps both the
table size manageable (2048 entries of 32 bits) and the number of terms in the Taylor
expansions small (two). It is implemented by separating $\theta$ into $\alpha$ and $\beta$ as shown in
Figure 4; $\alpha$ is the high-order 11 bits of $\theta$, and $\beta$ the remaining low-order bits. $\alpha$ is
used in exact (to one LSB) 11-bit to 32-bit table lookups, while (guaranteed small)
$\beta$ is used in Taylor expansions:

$$\cos(\beta) \approx 1 - \frac{\beta^2}{2} \quad \text{and} \quad \sin(\beta) \approx \beta - \frac{\beta^3}{6}$$

The accuracy of expanding each to only two terms is guaranteed by limiting the size of
$\beta$ to only the low-order 21 bits of $\theta$: the sum of the remaining terms in each expansion
sequence, for all $\beta$, is less than the LSB. Finally, $\alpha$ and $\beta$ are combined using the
relationships:

$$\sin(\alpha + \beta) = \sin(\alpha)\cos(\beta) + \cos(\alpha)\sin(\beta)$$
$$\cos(\alpha + \beta) = \cos(\alpha)\cos(\beta) - \sin(\alpha)\sin(\beta)$$

Attention now turns to an error analysis performed in accordance with an
embodiment of the invention. Relative frequency discrimination is based on the ratio
of adjacent frequencies. To determine worst-case relative error, it is desirable to
determine the maximum ratio between two adjacent $\epsilon$ values. Call these frequency
coefficients $\epsilon_1$ and $\epsilon_2$, and their corresponding frequencies $f_1$ and $f_2$. From the
definition of $\omega$ and the equation $\cos(\omega) = 1 - \epsilon/2$,

$$\frac{f_1}{f_2} = \frac{\cos^{-1}(1 - \epsilon_1/2)}{\cos^{-1}(1 - \epsilon_2/2)}$$



Taking any two adjacent numbers in the range, one can compute $f_1$ and $f_2$ with
the foregoing equation. Evaluating this ratio for all possible adjacent pairs of epsilon
values allows one to determine that it is maximized for $\epsilon_1 = 4 - 2^{-14}$ and $\epsilon_2 = 4 - 2^{-13}$,
where $f_1/f_2 \approx 1.0010337$. This ratio is lower than the minimum frequency ratio humans
are able to differentiate, a pitch difference of approximately four to five cents (about
1/25-1/20 of a semitone). The maximum error of the algorithm is actually less than
two cents: $\sqrt[600]{2} \approx 1.001156$.
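
The worst-case ratio quoted above can be reproduced with a brute-force scan. The sketch below assumes a 16-bit mantissa at exponent zero and restricts the scan to mantissas whose top bit is set (on the assumption that smaller values would be stored with a larger exponent); the function names are hypothetical.

```python
import math

MANTISSA_BITS = 16


def omega_from_epsilon(eps):
    """Invert cos(omega) = 1 - eps / 2."""
    return math.acos(1.0 - eps / 2.0)


def worst_adjacent_ratio(exponent=0):
    """Largest f1/f2 over adjacent normalized mantissas at the given exponent.

    The sampling rate cancels in the ratio, so only omega is needed.
    """
    scale = 2.0 ** (2 - exponent) / (1 << MANTISSA_BITS)
    worst_ratio, worst_m = 1.0, None
    for m in range(1 << (MANTISSA_BITS - 1), (1 << MANTISSA_BITS) - 1):
        ratio = omega_from_epsilon((m + 1) * scale) / omega_from_epsilon(m * scale)
        if ratio > worst_ratio:
            worst_ratio, worst_m = ratio, m
    return worst_ratio, worst_m


def ratio_to_cents(ratio):
    """Express a frequency ratio in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)
```

With these assumptions the maximum occurs at the top of the range, between $4 - 2^{-13}$ and $4 - 2^{-14}$, and converts to a little under two cents.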



This calculation illustrates that by designing higher $\epsilon$ values to coincide with
higher represented frequencies, a good match of the numerical representation to the
asymmetric accuracy requirements of human logarithmic pitch perception is achieved.

Two tones that are meant to have an exact ratio in their frequencies may
instead generate beat frequencies due to frequency quantization. This effect, caused
by absolute error, should be minimized.

Worst-case absolute error due to epsilon quantization is shown in Figure 5,
which contains a side-by-side comparison below 2000 Hz for an original signal 100
and a modified signal 102. Figure 6 is a more detailed representation of the modified
signal 102. As expected, the recast filter maintains more precise absolute frequency
than the original form.

Fundamentally, more than 16 bits of fractional coefficient are necessary to
obtain 1 Hz absolute precision across the audible spectrum. The method of the
invention maintains reasonable error bounds in sixteen bits of mantissa by scaling
these bits with the exponent.

At each iteration of the recursive form, a small error is introduced due to
rounding of the multiply result. Due to the recursion, this error isn't corrected until
reinitialization of the state variables. Possible effects of this include degradation of the
signal-to-noise ratio, degradation of the long-term phase accuracy, and a lack of
amplitude stability, all of which can cause audible artifacts.

These problems can be corrected via additional computations, but to avoid
additional computations, the invention exploits the ability to reinitialize the
computation at overlap-add frame boundaries, thereby allowing use of a non-self-
correcting (but higher-performance) digital oscillator form described below.

In one embodiment, the invention was implemented on a neural network and
signal processing accelerator board. This embodiment included a T0 chip, a 16-bit
fixed point vector arithmetic core developed by the University of California at
Berkeley and the International Computer Science Institute. The T0 chip tightly
couples a general-purpose scalar MIPS core to a high-performance vector coprocessor.
T0 is representative of digital signal processing architectures in its use of fixed-point
arithmetic. In order to compute summations of oscillators for additive synthesis with
an overlap-add approach for a pseudo-floating-point format, four multiplies, two
variable shifts, two fused (constant) shifts and adds, and two regular adds are required:

$$x_n = 2x_{n-1} - \epsilon\,x_{n-1} - x_{n-2}$$
$$\mathrm{out}_n = \mathrm{out}_n + a_n \times x_n$$

where $a_n$ is the amplitude applied to the oscillator at sample $n$.

A coded module implementing the foregoing expressions constitutes an additive
synthesizer 34 in accordance with the invention. Observe that this additive synthesizer
34 incorporates prior art components of additive synthesis (the idea of using an
analysis step followed by a re-synthesis step), and the general approach of using
recursive oscillators. However, the additive synthesizer 34 represents a new
formulation of prior art recursive oscillation techniques in its use of a modified filter
equation, its use of modified coefficient representations, and the explicit consideration
of the human auditory system to provide additional computational efficiency.
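
A floating-point sketch of such a module is given below. It is not the fixed-point T0 code: the function name `synthesize_bank`, the tuple layout of `partials`, the constant (envelope-free) amplitudes, and the use of Python lists to hold one state entry per oscillator are assumptions made to keep the example short.

```python
import math


def synthesize_bank(partials, sample_rate, num_samples):
    """Sum a bank of recursive oscillators using the epsilon form of the recurrence.

    partials: list of (freq_hz, amplitude, initial_phase) tuples (hypothetical layout).
    """
    eps, x1, x2, amp = [], [], [], []
    for freq_hz, a, p in partials:
        w = 2.0 * math.pi * freq_hz / sample_rate
        eps.append(2.0 - 2.0 * math.cos(w))   # frequency coefficient
        x1.append(math.sin(p - w))            # x[-1]
        x2.append(math.sin(p - 2.0 * w))      # x[-2]
        amp.append(a)
    out = [0.0] * num_samples
    for n in range(num_samples):
        # State is stored one entry per oscillator, so this inner loop is the
        # part a vector unit would execute across its lanes.
        for k in range(len(amp)):
            x0 = 2.0 * x1[k] - eps[k] * x1[k] - x2[k]
            out[n] += amp[k] * x0
            x2[k], x1[k] = x1[k], x0
    return out
```

Keeping the per-oscillator state in parallel arrays mirrors the striping of state variables across vector elements described earlier.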

On T0, the implementation of the additive synthesizer 34 requires a total of
$9 + 1/n$ vector arithmetic operations per sinusoid when unrolled $n$ times. Unrolling four
times due to trade-offs in register file pressure on T0, one achieves best-case
performance of about 1.15 cycles/partial: two fixed-frequency sinusoids are required
per variable-frequency partial because of overlap-add, 9.25 operations are required per
sine, two cycles are required per vector operation on T0, and the vector length is 32
elements. Thus, in this embodiment, performing 8 operations per cycle (peak) with a
40 MHz clock rate and at a 44.1 kHz sampling rate, a theoretical maximum of 768
partials can be achieved in real time excluding all overhead. The current
implementation supports up to 608 simultaneous real-time partials with frame lengths
of 5.8 ms or greater, or about 1.5 cycles per partial per sample.
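
The arithmetic behind these figures can be retraced as follows. The function names are hypothetical, and rounding the partial count down to a whole number of 32-element vectors is an assumption of this sketch that happens to reproduce the quoted figure of 768.

```python
def cycles_per_partial(unroll=4, base_ops=9, cycles_per_vector_op=2,
                       vector_length=32, sines_per_partial=2):
    """Best-case cycles per variable-frequency partial per sample."""
    ops_per_sine = base_ops + 1.0 / unroll
    return sines_per_partial * ops_per_sine * cycles_per_vector_op / vector_length


def max_partials(clock_hz=40e6, sample_rate=44100.0, vector_length=32):
    """Partials that fit in one sample period, in whole vectors (assumed rounding)."""
    cycles_per_sample = clock_hz / sample_rate
    raw = int(cycles_per_sample / cycles_per_partial(vector_length=vector_length))
    return (raw // vector_length) * vector_length


def measured_cycles_per_partial(partials, clock_hz=40e6, sample_rate=44100.0):
    """Cycle budget per partial per sample when `partials` run in real time."""
    return clock_hz / sample_rate / partials
```

Here `cycles_per_partial()` evaluates to 2 x 9.25 x 2 / 32 = 1.15625, and `measured_cycles_per_partial(608)` is close to the 1.5 cycles quoted for the implemented system.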

The invention may also be implemented on a Digital Signal Processor.
Although Digital Signal Processors typically do not have flexibly configured vector
pipelines, these processors support several different vector operand sizes, including
single precision floating point, 16-bit and 32-bit fixed point. Operand size and
coefficient alignment can be exploited according to desired frequency and amplitude
of each sinusoidal signal sequence.

The invention may also be implemented in Field Programmable Gate Arrays
(FPGAs). Such processors allow for the creation of new, specialized arithmetic
operations on a per instruction and per sinusoidal sequence basis. This allows use of
lattice filter structures and sinusoidal synthesis algorithms, such as quantizers, error
feedback, and non-linear operations, which are presently limited to custom hardware
processors.

Very Long Instruction Word (VLIW) processors may also be used to
implement the invention. These processors have multiple concurrent arithmetic units,
but use long instructions to control them rather than the vector processor's limited, but
compact vector instructions. Like vector processors, VLIW processors benefit from
algorithms exhibiting good locality of reference. The simplicity and regularity of the
second order recursive kernels used in this invention allow the VLIW compiler to
efficiently map the algorithm to a particular VLIW processor and, more importantly,
allow for effective code generation in applications where other algorithms are
performed concurrently with the sinusoidal models, such as the high level parametric
control structures for models.

The invention may also be implemented in RISC processors. Performance of
these processors depends on instruction order and cache utilization, both of which can
be optimized on the basis of desired frequency and phase to most accurately and
efficiently compute the approximating sinusoidal sequences.

Since transform domain methods also work well for these superscalar RISC
processors, it is necessary to consider further advantages of the present invention over
transform domain methods. The first advantage is computational: since in this
invention the sinusoids are computed directly and individually, no cost for a final
transform is incurred. This cost is especially significant when multiple independent
channels of summed sinusoids are required since a transform is required for each
channel. Transform domain methods cannot output elements of the output sequence
until the entire transform is performed. The resulting latency is avoided in this
invention because each element of the sinusoidal sequence may be stitched to its
predecessor sequence and be driven as output as soon as it is computed.

Another advantage of the invention is in connection with cache memory
utilization. This invention does not require a tabulated frequency domain window
function at all and the triangular window function it does require for the stitching need
not be tabulated as it may be computed with sufficient accuracy by accumulation.
This invention therefore affords a straightforward implementation of the window
stitching operations for any sequence length. Transform domain methods favor
window sizes which are powers of 2 or 3 and require considerable complexity to
dynamically change window sizes.

The invention may also be implemented on processors using a Residue
Number System. These processors are not widely deployed because of the high cost
of conversion of numbers from traditional 2's complement representation. This
problem is largely avoided with this invention since only the coefficients need to be
converted for each sinusoidal sequence. The sequences themselves can be efficiently
computed using Residue Number System arithmetic.

The invention may also be implemented on processors with a complex
arithmetic kernel. Such processors efficiently implement a vector rotation as a single
complex multiply. If the norm of a constant multiplicand is set to unity, a first-order,
complex vector rotation is mathematically equivalent to a second order real coefficient
system. In practice, the complex arithmetic kernel may be superior because it exhibits
smaller quantization errors.
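
The equivalence between the first-order complex rotation and the second-order real recurrence can be checked with a short floating-point sketch. The function names are hypothetical, and the example does not model quantization, so it illustrates only the mathematical equivalence and not the claimed difference in error behavior.

```python
import cmath
import math


def complex_oscillator(omega, phase, num_samples):
    """Rotate a unit-norm complex state by omega per sample; return the imaginary part."""
    rotation = cmath.exp(1j * omega)   # constant multiplicand with norm one
    z = cmath.exp(1j * phase)
    out = []
    for _ in range(num_samples):
        out.append(z.imag)             # sin(phase + n * omega)
        z *= rotation                  # one complex multiply per sample
    return out


def real_oscillator(omega, phase, num_samples):
    """Second-order real recurrence x[n] = 2 cos(omega) x[n-1] - x[n-2]."""
    coeff = 2.0 * math.cos(omega)
    x1, x2 = math.sin(phase - omega), math.sin(phase - 2.0 * omega)
    out = []
    for _ in range(num_samples):
        x0 = coeff * x1 - x2
        out.append(x0)
        x2, x1 = x1, x0
    return out
```

Both routines produce the same sequence up to rounding, because the unit-norm rotation $r = e^{j\omega}$ satisfies $r^2 - 2\cos(\omega)\,r + 1 = 0$.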
The foregoing description, for purposes of explanation, used specific
nomenclature to provide a thorough understanding of the invention. However, it will
be apparent to one skilled in the art that the specific details are not required in order to
practice the invention. In other instances, well known circuits and devices are shown
in block diagram form in order to avoid unnecessary distraction from the underlying
invention. Thus, the foregoing descriptions of specific embodiments of the present
invention are presented for purposes of illustration and description. They are not
intended to be exhaustive or to limit the invention to the precise forms disclosed;
obviously many modifications and variations are possible in view of the above
teachings. The embodiments were chosen and described in order to best explain the
principles of the invention and its practical applications, to thereby enable others
skilled in the art to best utilize the invention and various embodiments with various
modifications as are suited to the particular use contemplated. It is intended that the
scope of the invention be defined by the following claims and their equivalents.